The Integrated Language Database of 8th–21st-Century Dutch
Abstract
The Institute for Dutch Lexicology (INL) has a long-standing tradition in corpus-based lexicography. Its results include electronic scholarly dictionaries of Dutch covering the vocabulary from 1200 to 1976, linguistically annotated electronic text corpora of historical and present-day Dutch, and computational lexica. An ongoing long-term INL project, the Integrated Language Database of 8th–21st-Century Dutch (ILD), adds value to these data. The aim is to create a flexible linguistic research instrument by linking the dictionaries, a balanced diachronic text corpus, and lexica of historical and present-day Dutch. We will also link part of our data with data collections stored at other institutes, creating a supra-institutional research instrument. The paper reports on the overall ILD design and the user's perspective. The focus is on the ILD prototype, which, when finished, will serve as a demonstration model to verify and assess user needs. At present it serves to test the design empirically for its applicability to 'real data' and to obtain figures on workload and the like. We conclude that this testing function has proved the prototype to be an indispensable pilot for the ILD.
Similar resources
Implementation and Evaluation of PAROLE PoS in a National Context
We are annotating the complete 20-million-word Dutch PAROLE corpus with PoS and lemma. The morphosyntactic tagging of 250,000 words during the PAROLE project was the first confrontation of the fine-grained Dutch PAROLE tagset, and its 'functional' mode of application, with real corpus data. The correction of the manual tagging and the compilation of a 100,000-word training corpus for the automatic t...
Pronunciation Barriers and Computer Assisted Language Learning (CALL): Coping the Demands of 21st Century in Second Language Learning Classroom in Pakistan
Pronunciation of English is a very important sub-skill of the speaking module in the second language learning process. However, it is ignored, neglected, and given hardly any attention by teachers, administrators, and stakeholders, especially in Pakistan. Grammar, vocabulary, and other linguistic skills such as reading and writing are emphasized, whereas pronunciation has never be...
Language planning
Language planning, in one way or another, is as old as human civilization. Every time that one polity invaded the territory of another, the language of the conqueror was imposed on the conquered. The Romans imposed their language across the civilized world as they knew it. In the 21st century, the practice of language planning has become increasingly sophisticated. Eng...
An Exploration of Language Identification Techniques for the Dutch Folktale Database
The Dutch Folktale Database contains fairy tales, traditional legends, urban legends, and jokes written in a large variety and combination of languages including (Middle and 17th century) Dutch, Frisian and a number of Dutch dialects. In this work we compare a number of approaches to automatic language identification for this collection. We show that in comparison to typical language identifica...
Publication date: 2004